Department of Biostatistics, Johns Hopkins School of Public Health
For each second and each person:
Obtain joint distribution of acceleration and lag acceleration for a series of lags
Calculate scalar summaries of the joint distribution
I will walk through the process for one second, one person, and one lag
Intuition: walking is a cyclic process. We want to leverage the cyclic nature of walking.
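A minimal sketch of the one-second, one-person, one-lag case, assuming the joint distribution is summarized by counting (lag acceleration, acceleration) pairs in square grid cells. The source does not say which scalar summaries are used, so the cell counts, the function names `lag_pairs` and `grid_cell_counts`, the cell width, and the synthetic signal are all illustrative assumptions.

```python
import math

def lag_pairs(second, lag):
    """Return (lag acceleration, acceleration) pairs (v(s - lag), v(s)) for one second."""
    return [(second[s - lag], second[s]) for s in range(lag, len(second))]

def grid_cell_counts(pairs, cell_width):
    """Count how many pairs fall in each square cell of a regular grid (assumed summary)."""
    counts = {}
    for lagged, current in pairs:
        cell = (math.floor(lagged / cell_width), math.floor(current / cell_width))
        counts[cell] = counts.get(cell, 0) + 1
    return counts

# Synthetic, perfectly periodic signal: 8 observations with a period of 4.
second = [1.0, 1.5, 1.0, 0.5, 1.0, 1.5, 1.0, 0.5]
pairs = lag_pairs(second, lag=4)
counts = grid_cell_counts(pairs, cell_width=0.25)
```

Because the lag equals the period of this toy signal, every pair lies on the diagonal (acceleration equals lag acceleration), which is the kind of structure a cyclic signal leaves in the joint distribution.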
Hat tip to Edward Gunning for the idea for these figures
Toy example: 4 observations per second, 2 seconds, 1 individual
\(v_1(2)\): 2nd acceleration observation in second 1
data \[\begin{bmatrix} v_1(1) & v_1(2) & v_1(3) & v_1(4) \\ v_2(1) & v_2(2) & v_2(3) & v_2(4) \\ \end{bmatrix} \]
acceleration matrix \[\begin{bmatrix} v_1(2) & v_1(3) & v_1(4) & v_1(3) & v_1(4) & v_1(4) \\ v_2(2) & v_2(3) & v_2(4) & v_2(3) & v_2(4) & v_2(4) \\ \end{bmatrix} \]
lag acceleration matrix \[\begin{bmatrix} v_1(1) & v_1(1) & v_1(1) & v_1(2) & v_1(2) & v_1(3) \\ v_2(1) & v_2(1) & v_2(1) & v_2(2) & v_2(2) & v_2(3) \\ \end{bmatrix} \]
lag matrix \[\begin{bmatrix} 1 & 2 & 3 & 1 & 2 & 1\\ 1 & 2 & 3 & 1 & 2 & 1\\\end{bmatrix} \]
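The three matrices in the toy example can be built from the data matrix as sketched below. The function name `build_lag_matrices` is hypothetical, and string labels stand in for the acceleration values so the output can be compared directly with the matrices above.

```python
def build_lag_matrices(data):
    """Build acceleration, lag acceleration, and lag matrices.

    data: list of rows, one row per second, one entry per observation.
    """
    n_obs = len(data[0])
    # Column order follows the toy example: for each lagged index t,
    # pair it with every later index s (zero-based indices here).
    index_pairs = [(s, t) for t in range(n_obs) for s in range(t + 1, n_obs)]
    accel = [[row[s] for s, t in index_pairs] for row in data]
    lag_accel = [[row[t] for s, t in index_pairs] for row in data]
    lag = [[s - t for s, t in index_pairs] for _ in data]
    return accel, lag_accel, lag

# Toy example: 4 observations per second, 2 seconds, 1 individual.
data = [
    ["v1(1)", "v1(2)", "v1(3)", "v1(4)"],
    ["v2(1)", "v2(2)", "v2(3)", "v2(4)"],
]
accel, lag_accel, lag = build_lag_matrices(data)
```

With 4 observations per second, each row has 4 × 3 / 2 = 6 columns, one per (observation, earlier observation) pair.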
Model outcomes as:
\[Y_{ij}^{i_0}\sim\text{Bernoulli}(p_{ij}^{i_0})\]
where \(Y_{ij}^{i_0} = 1\) if second \(j\) of subject \(i\) belongs to subject \(i_0\) (that is, \(i = i_0\)), and 0 otherwise
Model:
\[\text{logit}(p_{ij}^{i_0}) =\beta_0^{i_0} + \int_{u=1}^S\int_{s=u}^SF_{i_0}\{ v_{ij}(s), v_{ij}(s-u), u\}dsdu \]
\(u = 1, \dots, S\): lag, where \(S = 100\) is the number of observations per second
\(v_{ij}(s)\) = acceleration at centisecond \(s\) for subject \(i\) in second \(j\)
\(F_{i_0}(\cdot, \cdot, \cdot)\): trivariate smooth function, defined at every point in the domain of acceleration, lag acceleration, and lag
Fit using penalized splines with a quadratic penalty on the functional coefficient (Wood 2016)
\(\texttt{te()}\): tensor product smooth
\(\texttt{k = c(5, 5, 5)}\) number of basis functions for each dimension of the tensor product smooth
\(\texttt{weight\_mat}\): matrix of weights of linear functionals of smooth terms. We use equal weights so the \(i,j^{\mathrm{th}}\) entry is \(\texttt{1/nrow(accel\_mat)}\)
\(\texttt{method="REML"}\): smoothing parameter selection with restricted maximum likelihood
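One way to read the fitted term, as a sketch: when the arguments of a smooth are matrices and a weight matrix is supplied, the smooth is treated as a linear functional, that is, a weighted sum of \(F_{i_0}\) over the columns of each row. Under that reading, the double integral in the model is approximated by the sum below. The symbols \(K\) (number of columns, i.e. acceleration/lag pairs per second), \(a_{ij,k}\), \(\ell_{ij,k}\), \(u_{ij,k}\) and \(w_{ij,k}\) are notation introduced here for the entries in row \(ij\), column \(k\) of the acceleration, lag acceleration, lag, and weight matrices.

\[\text{logit}(p_{ij}^{i_0}) \approx \beta_0^{i_0} + \sum_{k=1}^{K} w_{ij,k}\, F_{i_0}\{a_{ij,k},\, \ell_{ij,k},\, u_{ij,k}\}\]

In the toy example with 4 observations per second, \(K = 6\).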
Rank-1 (rank-5) % accuracies
153 person dataset
3 min of walking each
Two sessions at least 1 week apart
5 open-source algorithms, 3 datasets with gold-standard step counts
How many steps does the average American take per day?
Do estimates differ by algorithm?
Are more steps associated with lower mortality risk?
Do males take more steps than females? At what points during the day?
Implementation: Fast univariate inference (FUI) Cui et al. (2021) \[\mathbb{E}[\mathrm{steps}_i(s)] = \beta_0(s) + \beta_1(s)\mathrm{gender}_i + \beta_2(s)\mathrm{age}_i \]
\(i\): participant
\(s \in \{1, \dots, 1440\}\): each minute of the day
Fit a separate GLM at each minute \(s\), then smooth the resulting point estimates across \(s\) to estimate the effects of age and sex on the step profile
Bootstrap subjects to get confidence bands
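The first two steps above (pointwise fits, then smoothing) can be sketched as follows. This is not the FUI implementation: it assumes an ordinary least-squares fit (a Gaussian GLM with identity link) at each time point and uses a centered moving average as a stand-in smoother, since the source does not name one. All function names and the noiseless toy data are hypothetical, and the bootstrap step is not covered.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def ols(design, y):
    """Least-squares coefficients via the normal equations."""
    p = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(p)]
           for a in range(p)]
    xty = [sum(row[a] * yi for row, yi in zip(design, y)) for a in range(p)]
    return solve_linear_system(xtx, xty)

def pointwise_fit(design, steps):
    """Fit one regression per time point; one coefficient list per time point."""
    n_times = len(steps[0])
    return [ols(design, [subject[s] for subject in steps]) for s in range(n_times)]

def moving_average(values, half_window):
    """Centered moving average, truncated at both ends of the curve."""
    smoothed = []
    for s in range(len(values)):
        window = values[max(0, s - half_window): s + half_window + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed

def fit_and_smooth(design, steps, half_window=1):
    """Return one smoothed coefficient curve per covariate."""
    raw = pointwise_fit(design, steps)
    p = len(design[0])
    return [moving_average([coef[k] for coef in raw], half_window) for k in range(p)]

# Noiseless toy data: 6 participants, 5 time points (the slides use 1440 minutes).
gender = [0, 1, 0, 1, 0, 1]
age = [20, 30, 40, 50, 60, 70]
design = [[1.0, float(g), float(a)] for g, a in zip(gender, age)]
steps = [[10.0 * s + 5.0 * g + 0.5 * a for s in range(5)]
         for g, a in zip(gender, age)]

raw = pointwise_fit(design, steps)
beta_curves = fit_and_smooth(design, steps)
```

Here `beta_curves[0]`, `beta_curves[1]` and `beta_curves[2]` play the roles of \(\beta_0(s)\), \(\beta_1(s)\) and \(\beta_2(s)\). Because the toy data are noiseless, the pointwise fits recover the generating coefficients up to floating-point error.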
BUT: NHANES is not a simple random sample
Individuals are sampled in geographic clusters
Minority groups are oversampled
Are our estimates valid for population-level inference?
For standard regression: \(\texttt{svyglm}\), \(\texttt{svycoxph}\)
For functional regression: ?
\[\mathbb{E}[\mathrm{steps}_i(s)] = \beta_0(s) + \beta_1(s)\mathrm{gender}_i + \beta_2(s)\mathrm{age}_i \]
\(i\): participant
\(s \in \{1, \dots, 1440\}\): each minute of the day
\(\beta_0(s)\): mean steps over the course of the day for males at age 0
\(\beta_1(s)\): how many additional steps do females take compared to males, over the course of the day, controlling for age?